Abstract



The capacity of non-coherent stationary Gaussian fading channels with
memory under a peak-power constraint is studied in the asymptotic
weak-signal regime. It is assumed that the fading law is known to both
transmitter and receiver but that neither is cognizant of the fading
realization.

A connection is demonstrated between the asymptotic behavior of
channel capacity in this regime and the asymptotic behavior of the
prediction error incurred in predicting the fading process from very noisy
observations of its past. This connection can be viewed as the low
signal-to-noise ratio (SNR) analog of recent results by Lapidoth & Moser and
by Lapidoth demonstrating connections between the high SNR capacity
growth and the noiseless or almost-noiseless prediction error.

We distinguish between two families of fading laws: the "slowly
forgetting" and the "quickly forgetting". For channels in the former category
the low SNR capacity is achieved by IID inputs, whereas in the latter such
inputs are typically sub-optimal. Instead, the asymptotic capacity can be
approached by inputs with IID phase but block-constant magnitude.



1 Introduction 

We present results on the low signal-to-noise ratio (SNR) asymptotic capacity 
of discrete-time peak-power limited single-antenna fading channels. The fading 
process is assumed to be a zero-mean unit-variance circularly-symmetric station- 
ary complex Gaussian process with memory whose law — but not realization — is 
known to both the transmitter and the receiver. Input distributions that ap- 
proach or achieve the asymptotic capacity are also studied. 

Previous work on this channel [?], [?], [?] focused on the behavior of capacity
at high SNR. For regular fading processes [7] demonstrated a connection between
the high-SNR capacity growth and the prediction error in predicting the fading
process from noiseless observations of its past, whereas for non-regular fading [?]
demonstrated such a connection to the functional dependence on the observation
noise variance of the prediction error in predicting the fading process from noisy
observations of its past in the low observation noise regime.

Here we point to an analogous connection between the low SNR asymptotic
capacity and the prediction error in predicting the fading process from very
noisy observations of its past. We show that, for most channels of interest, the
limiting ratio of channel capacity to the square of the SNR is fully determined by
the derivative at SNR = 0 of the prediction error in predicting the fading process
based on observations of its past contaminated by IID Gaussian observation
noise of variance 1/SNR.

Denoting this derivative by $(-\phi)$ we distinguish between two families of
fading laws: the "slowly forgetting" where $\phi > 1/2$ and the "quickly forgetting"
where $0 \le \phi \le 1/2$. For the former family IID inputs are asymptotically
optimal, whereas for the latter IID inputs are typically (for $\phi > 0$) sub-optimal.
To approach capacity on "quickly forgetting" fading channels we propose to use
inputs of IID phase but block-constant magnitude.

When the fading spectral distribution function is discontinuous, i.e., when
the fading has spectral lines, $\phi$ is infinite. In this case capacity at low SNR
typically scales linearly with the SNR, with the limiting ratio of channel capacity
to the SNR being the sum of the jumps in the spectral distribution function [12].

Previous work on the low SNR capacity of this channel dealt with the block
fading model (which in the special case where the fading block size is one
corresponds to memoryless fading for which $\phi = 0$) [2] and with the information
rates achievable with fixed finite dimensional input distributions [?]. Our work
relies heavily on both these papers.

It should be emphasized that this paper only deals with peak-power
constraints. If those are replaced with average power constraints, capacity at low
SNR typically scales linearly with the SNR; see [?], [?], [?] and references
therein.



2 Channel Model and Main Results 

2.1 Channel Model 

We consider a discrete-time channel whose time-$k$ complex-valued output
$Y_k \in \mathbb{C}$ is given by

$$Y_k = H_k x_k + Z_k \tag{1}$$

where $x_k \in \mathbb{C}$ is the complex-valued channel input at time $k$; the complex
process $\{H_k\}$ models multiplicative noise; and the complex process $\{Z_k\}$ models
additive noise.

We assume that the additive noise sequence $\{Z_k\}$ is a sequence of IID
circularly symmetric complex-Gaussian random variables of mean zero and variance
$\sigma^2$ where $\sigma^2 > 0$. Such a Gaussian distribution is denoted by
$\mathcal{N}_{\mathbb{C}}(0, \sigma^2)$. The "fading process" $\{H_k\}$ is assumed to be a
zero-mean, unit-variance, stationary, circularly symmetric, Gaussian process. We
denote its autocorrelation function by $R(\cdot)$ and its spectral distribution
function by $F(\cdot)$. Thus

$$R(m) \triangleq E\left[H_{k+m} H_k^*\right] = \int_{-1/2}^{1/2} e^{i 2\pi m \lambda}\, dF(\lambda), \quad k, m \in \mathbb{Z} \tag{2}$$

and

$$R(0) = E\left[|H_k|^2\right] = \int_{-1/2}^{1/2} dF(\lambda) = 1. \tag{3}$$

We denote the derivative of the spectral distribution function by $F'(\cdot)$. If
the spectral distribution function $F(\cdot)$ is absolutely continuous then the fading
process has a spectral density. In this case the spectral density is given by $F'(\cdot)$
and will be denoted by $f(\cdot)$. Thus, if the fading has a spectral density $f(\cdot)$ then

$$R(m) = \int_{-1/2}^{1/2} e^{i 2\pi m \lambda} f(\lambda)\, d\lambda, \quad m \in \mathbb{Z} \tag{4}$$

and $f(\cdot)$ is a non-negative function satisfying

$$\int_{-1/2}^{1/2} f(\lambda)\, d\lambda = 1. \tag{5}$$

We shall assume throughout that the processes $\{H_k\}$ and $\{Z_k\}$ are
independent and that their joint law does not depend on the input sequence $\{x_k\}$.

The input constraint that we consider is the peak-power constraint according
to which the time-$k$ input $x_k$ must satisfy

$$|x_k| \le A \tag{6}$$

where $A$ is a positive real number and $A^2$ stands for the peak-power. We define
the signal-to-noise ratio (SNR) as

$$\mathrm{SNR} \triangleq \frac{A^2}{\sigma^2}. \tag{7}$$

The capacity of this channel is given by

$$C(\mathrm{SNR}) = \lim_{n \to \infty} \frac{1}{n} \sup I(X_1, \ldots, X_n; Y_1, \ldots, Y_n) \tag{8}$$

where the supremum is over all joint distributions on $X_1, \ldots, X_n$ under which,
with probability one,

$$|X_k| \le A, \quad k = 1, \ldots, n. \tag{9}$$
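To make the model concrete, the following is a minimal simulation sketch of the channel (1) under the peak constraint (6). It assumes, purely for illustration, a first-order Gauss-Markov fading process with $R(m) = a^{|m|}$ for a real $0 \le a < 1$; this specific fading law, and the names `complex_gaussian` and `simulate_channel`, are hypothetical choices and not taken from the paper.

```python
import math
import random


def complex_gaussian(rng, variance):
    """Draw one circularly symmetric complex Gaussian sample N_C(0, variance)."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simulate_channel(x, a, sigma2, seed=0):
    """Return outputs Y_k = H_k x_k + Z_k as in (1).

    Assumption (illustrative only): H_k = a H_{k-1} + sqrt(1 - a^2) V_k with
    V_k IID N_C(0, 1), a stationary unit-variance process with R(m) = a^|m|.
    """
    rng = random.Random(seed)
    h = complex_gaussian(rng, 1.0)  # stationary start: first fade ~ N_C(0, 1)
    y = []
    for xk in x:
        y.append(h * xk + complex_gaussian(rng, sigma2))
        h = a * h + math.sqrt(1.0 - a * a) * complex_gaussian(rng, 1.0)
    return y


A, sigma2 = 0.1, 1.0                    # peak amplitude and noise variance
snr = A * A / sigma2                    # definition (7)
x = [A * (-1) ** k for k in range(8)]   # any input with |x_k| <= A is admissible
y = simulate_channel(x, a=0.9, sigma2=sigma2)
```

The receiver in this sketch sees only `y`; the fading realization `h` is never exposed, which mirrors the non-coherent setting.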

2.2 Noisy Prediction 

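The least mean squared prediction error that one can attain when trying to
predict $H_0$ based on its infinite past $(\ldots, H_{-2}, H_{-1})$ is given by [?]

$$\exp\left\{ \int_{-1/2}^{1/2} \log F'(\lambda)\, d\lambda \right\}. \tag{10}$$

More generally, if we try to predict $H_0$ based on $(\ldots, H_{-2} + W_{-2}, H_{-1} + W_{-1})$
where $\{W_k\}$ are IID $\mathcal{N}_{\mathbb{C}}(0, \delta^2)$ and independent of $\{H_k\}$ then the least mean
squared prediction error $\epsilon^2_{\mathrm{pred}}(\delta^2)$ is given by [?]

$$\epsilon^2_{\mathrm{pred}}(\delta^2) = \exp\left\{ \int_{-1/2}^{1/2} \log\left(F'(\lambda) + \delta^2\right) d\lambda \right\} - \delta^2. \tag{11}$$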
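As a numerical illustration of (10) and (11), the sketch below evaluates the noisy prediction error by a midpoint rule. It assumes a first-order Gauss-Markov fading law with $R(m) = a^{|m|}$, whose spectral density is $f(\lambda) = (1 - a^2)/(1 - 2a\cos 2\pi\lambda + a^2)$; this fading law and the function names are hypothetical and chosen only because the noiseless one-step prediction error is then known to be $1 - a^2$.

```python
import math


def spectral_density(lam, a):
    """Spectral density of the assumed Gauss-Markov fading with R(m) = a^|m|."""
    return (1.0 - a * a) / (1.0 - 2.0 * a * math.cos(2.0 * math.pi * lam) + a * a)


def noisy_prediction_error(delta2, a, n=20000):
    """Evaluate (11) with the midpoint rule on [-1/2, 1/2]."""
    total = 0.0
    for i in range(n):
        lam = -0.5 + (i + 0.5) / n
        total += math.log(spectral_density(lam, a) + delta2)
    return math.exp(total / n) - delta2


a = 0.5
noiseless = noisy_prediction_error(0.0, a)  # delta^2 = 0 reduces (11) to (10)
noisy = noisy_prediction_error(1.0, a)      # observation noise of variance 1
```

In this example the error grows with the observation noise variance but stays below the fading variance of one, as expected of a least mean squared predictor.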

The following lemma describes the asymptotics of the noisy prediction error 
from very noisy observations of the past: 

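Lemma 1. If the fading process has a spectral density function $f(\lambda) \ge 0$
satisfying (5) and

$$\int_{-1/2}^{1/2} f^2(\lambda)\, d\lambda < \infty \tag{12}$$

then

$$\phi \triangleq \lim_{\rho \downarrow 0} \frac{1 - \epsilon^2_{\mathrm{pred}}(1/\rho)}{\rho} \tag{13}$$

is defined and is given by

$$\phi = \frac{1}{2} \int_{-1/2}^{1/2} f^2(\lambda)\, d\lambda - \frac{1}{2} \tag{14}$$

which can also be expressed using Parseval's Theorem as

$$\phi = \sum_{\nu=1}^{\infty} |R(\nu)|^2. \tag{15}$$

Thus,

$$\epsilon^2_{\mathrm{pred}}(1/\rho) = 1 - \phi \cdot \rho + o(\rho) \tag{16}$$

where the error term $o(\rho)$ is small enough so that $o(\rho)/\rho$ tends to zero as $\rho$
tends to zero.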

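Proof. Using (11) we have

$$
\begin{aligned}
\frac{1 - \epsilon^2_{\mathrm{pred}}(1/\rho)}{\rho}
&= \frac{1 - \left( \exp\left\{ \int_{-1/2}^{1/2} \log\left( f(\lambda) + \frac{1}{\rho} \right) d\lambda \right\} - \frac{1}{\rho} \right)}{\rho} \\
&= \frac{1 - \left( \exp\left\{ \int_{-1/2}^{1/2} \left[ \log(1 + \rho f(\lambda)) + \log(1/\rho) \right] d\lambda \right\} - \frac{1}{\rho} \right)}{\rho} \\
&= \frac{1}{\rho} + \frac{1}{\rho^2} - \frac{1}{\rho^2} \exp\left\{ \int_{-1/2}^{1/2} \left( \rho f(\lambda) - \frac{1}{2} \rho^2 f^2(\lambda) + \Delta(\lambda, \rho) \right) d\lambda \right\} \\
&= \frac{1}{\rho} + \frac{1}{\rho^2} - \frac{1}{\rho^2} \exp\left\{ \rho - \frac{\rho^2}{2} \int_{-1/2}^{1/2} f^2(\lambda)\, d\lambda + \int_{-1/2}^{1/2} \Delta(\lambda, \rho)\, d\lambda \right\}
\end{aligned}
$$

where

$$\Delta(\lambda, \rho) \triangleq \log(1 + \rho f(\lambda)) - \left( \rho f(\lambda) - \frac{1}{2} \rho^2 f^2(\lambda) \right).$$

The result now follows by a second order Taylor expansion of the exponential
function and from the fact that

$$\lim_{\rho \downarrow 0} \int_{-1/2}^{1/2} \frac{\Delta(\lambda, \rho)}{\rho^2}\, d\lambda = 0. \tag{17}$$

This latter fact can be proved using the Monotone Convergence Theorem (MCT).
Indeed, for any fixed $\lambda \in [-1/2, 1/2]$ we have by Taylor's expansion applied to
the function $\xi \mapsto \log(1 + \xi)$ that the integrand converges to zero as $\rho \downarrow 0$.
Moreover, it can be verified that for such fixed $\lambda$ the integrand is monotonically
decreasing to zero as $\rho \downarrow 0$. Finally, for $\rho = 1$ the integrand is an integrable
function. $\square$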
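The three characterizations of $\phi$ in Lemma 1 can be compared numerically. The sketch below does so for an assumed first-order Gauss-Markov fading law with $R(\nu) = a^{|\nu|}$, for which the series (15) sums to $a^2/(1 - a^2)$; the fading law, the quadrature rule and all function names are illustrative assumptions rather than part of the paper.

```python
import math


def spectral_density(lam, a):
    """Spectral density of the assumed Gauss-Markov fading with R(m) = a^|m|."""
    return (1.0 - a * a) / (1.0 - 2.0 * a * math.cos(2.0 * math.pi * lam) + a * a)


def midpoint_integral(g, n=20000):
    """Midpoint-rule approximation of the integral of g over [-1/2, 1/2]."""
    return sum(g(-0.5 + (i + 0.5) / n) for i in range(n)) / n


def phi_from_density(a):
    """Formula (14): half the integral of f^2, minus one half."""
    return 0.5 * midpoint_integral(lambda lam: spectral_density(lam, a) ** 2) - 0.5


def phi_from_autocorrelation(a, terms=2000):
    """Formula (15), truncated after `terms` lags."""
    return sum(abs(a ** nu) ** 2 for nu in range(1, terms + 1))


def phi_from_prediction_error(a, rho):
    """Difference quotient in (13), using (11) with delta^2 = 1/rho."""
    log_int = midpoint_integral(
        lambda lam: math.log(spectral_density(lam, a) + 1.0 / rho))
    eps2 = math.exp(log_int) - 1.0 / rho
    return (1.0 - eps2) / rho


a = 0.5
phi_14 = phi_from_density(a)
phi_15 = phi_from_autocorrelation(a)
phi_13 = phi_from_prediction_error(a, rho=1e-3)
```

The first two values agree to quadrature accuracy, while the difference quotient approaches them only as `rho` decreases, consistent with the $o(\rho)$ term in (16).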

2.3 Main Results 

The asymptotic behavior of the capacity at low SNR for fading processes having 
a spectral density function is given in the following theorem. 

Theorem 1. If the stationary zero-mean circularly-symmetric complex Gaussian
fading process has a spectral density function $f(\lambda) \ge 0$ satisfying (5)
and (12) then

$$
\lim_{\mathrm{SNR} \downarrow 0} \frac{C(\mathrm{SNR})}{\mathrm{SNR}^2} =
\begin{cases}
\frac{1}{8} (2\phi + 1)^2 & \text{if } \phi \le \frac{1}{2} \\
\phi & \text{if } \phi > \frac{1}{2}
\end{cases}
\tag{18}
$$

where $\phi$ is given in (14) or (15).

We refer to fading processes for which $\phi > 1/2$ as "slowly forgetting" and
to those with $0 \le \phi \le 1/2$ as "quickly forgetting". In the former case when
$\phi > 1/2$ the asymptotic capacity is achieved by IID inputs taking the values $\pm A$
with equal probability. For the latter case when $0 \le \phi \le 1/2$ the asymptotic
capacity is typically not achieved by IID inputs [12]. In such cases, information
rates that at low SNR get arbitrarily close to the capacity can be achieved by
inputs of IID phase but block-constant magnitude.
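For quick reference, the limit in Theorem 1 can be written as a small function of $\phi$. This is only a sketch of the closed form; it takes the first branch to apply for $\phi \le 1/2$ (the two branches agree at $\phi = 1/2$), and the function names are hypothetical.

```python
def low_snr_capacity_slope(phi):
    """Limit of C(SNR)/SNR^2 as SNR -> 0, following the closed form of Theorem 1."""
    if phi < 0:
        raise ValueError("phi is a sum of squared magnitudes and cannot be negative")
    if phi <= 0.5:
        return (2.0 * phi + 1.0) ** 2 / 8.0
    return phi


def classify(phi):
    """Name the family of fading laws that phi belongs to."""
    return "slowly forgetting" if phi > 0.5 else "quickly forgetting"


memoryless = low_snr_capacity_slope(0.0)   # phi = 0 for memoryless fading
boundary = low_snr_capacity_slope(0.5)     # both branches meet here
```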

When the spectral distribution function is discontinuous, the fading process
has spectral lines and does not have a spectral density function. Nevertheless,
being a monotonic function, the spectral distribution function is differentiable
except on a set of measure zero. In this case capacity typically increases linearly
with the SNR in the low SNR regime with the limiting ratio of capacity to SNR
being the sum of the jumps in the spectral distribution function $F(\cdot)$ [12].

Theorem 1 will be proved in Section 3 by exhibiting upper and lower bounds
on channel capacity that at low SNR coincide.



3 Proof of Theorem 1

In this section we prove Theorem 1 by deriving an upper bound and a lower
bound on channel capacity that coincide at low SNR. We begin with the upper
bound.

3.1 Proof of the Upper Bound 

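Our first steps towards an upper bound on channel capacity are similar to those
of [?]. We begin with the chain rule:

$$
\begin{aligned}
I(X_1^n; Y_1^n)
&= \sum_{k=1}^{n} I\left(X_1^n; Y_k \,\middle|\, Y_1^{k-1}\right) \\
&= \sum_{k=1}^{n} I\left(X_1^n, Y_1^{k-1}; Y_k\right) - I\left(Y_k; Y_1^{k-1}\right) \\
&\le \sum_{k=1}^{n} I\left(X_1^n, Y_1^{k-1}; Y_k\right) \\
&= \sum_{k=1}^{n} I\left(X_1^k, Y_1^{k-1}; Y_k\right) \\
&= \sum_{k=1}^{n} I(X_k; Y_k) + I\left(X_1^{k-1}, Y_1^{k-1}; Y_k \,\middle|\, X_k\right).
\end{aligned}
\tag{19}
$$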

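Consequently we have by (19)

$$C(\mathrm{SNR}) \le \sup_{p_{X_0}} \left\{ I(X_0; Y_0) + I\left(X_{-\infty}^{-1}, Y_{-\infty}^{-1}; Y_0 \,\middle|\, X_0\right) \right\} \tag{20}$$

where the term $I(X_0; Y_0)$ is the mutual information corresponding to the input
distribution $p_{X_0}$ for the memoryless Rayleigh fading channel. By [2]

$$I(X_0; Y_0) = \frac{E\left[|X_0|^4\right] - \left(E\left[|X_0|^2\right]\right)^2}{2\sigma^4} + o\left(\mathrm{SNR}^2\right) \tag{21}$$

where $o(\mathrm{SNR}^2)/\mathrm{SNR}^2 \to 0$ as $\mathrm{SNR} \to 0$.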

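We now study the second term on the right hand side of (20). As in [?] we
introduce random variables $\{W_k\}$ that are IID $\mathcal{N}_{\mathbb{C}}(0, 1/\mathrm{SNR})$ and independent
of $\{H_k\}$ to model the observation noise. Denoting by $E[\cdot]$ the mathematical
expectation we have

$$
\begin{aligned}
I\left(X_{-\infty}^{-1}, Y_{-\infty}^{-1}; Y_0 \,\middle|\, X_0\right)
&= h(Y_0 | X_0) - h\left(Y_0 \,\middle|\, X_0, X_{-\infty}^{-1}, Y_{-\infty}^{-1}\right) \\
&\le h(Y_0 | X_0) - h\left(Y_0 \,\middle|\, X_0, \{H_\nu + W_\nu\}_{\nu=-\infty}^{-1}\right) \\
&= E\left[ \log \frac{\sigma^2 + |X_0|^2}{\sigma^2 + |X_0|^2 \cdot \epsilon^2_{\mathrm{pred}}(1/\mathrm{SNR})} \right] \\
&\le \log \frac{\sigma^2 + E\left[|X_0|^2\right]}{\sigma^2 + E\left[|X_0|^2\right] \cdot \epsilon^2_{\mathrm{pred}}(1/\mathrm{SNR})} \\
&= \log \frac{\sigma^2 + E\left[|X_0|^2\right]}{\sigma^2 + E\left[|X_0|^2\right] \cdot \left(1 - \phi \cdot \mathrm{SNR} + o(\mathrm{SNR})\right)} \\
&\le \phi \cdot \mathrm{SNR} \cdot \frac{E\left[|X_0|^2\right]}{\sigma^2} + o\left(\mathrm{SNR}^2\right).
\end{aligned}
\tag{22}
$$

Here the first inequality follows because, given the present input, all the
information about the present output that is contained in the past inputs and outputs
is maximized when all the past inputs have maximum magnitude; the subsequent
equality follows from the explicit expression for the differential entropy of
circularly-symmetric Gaussians and because we assumed that the fading process
is of unit variance; the subsequent inequality follows from Jensen's inequality,
because $(\sigma^2 + r)/(\sigma^2 + \epsilon^2 r)$ is concave in $r \in [0, \infty)$ when $\epsilon^2 \le 1$ and therefore
so is its logarithm; the subsequent equality follows by Lemma 1; and the final step
follows by expressing the logarithm of the ratio as the difference of two logarithms and
by expanding each of the logarithms into its Taylor series expansion, keeping in
mind that by the peak-constraint $|X_0|^2/\sigma^2 \le \mathrm{SNR}$.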
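By (20), (21), and (22) we obtain

$$
\begin{aligned}
C &\le \sup_{p_{X_0}} \left\{ \frac{E\left[|X_0|^4\right] - \left(E\left[|X_0|^2\right]\right)^2}{2\sigma^4} + \phi \cdot \mathrm{SNR} \cdot \frac{E\left[|X_0|^2\right]}{\sigma^2} \right\} + o\left(\mathrm{SNR}^2\right) &\quad (23) \\
&\le \sup_{p_{X_0}} \left\{ \frac{A^2 E\left[|X_0|^2\right] - \left(E\left[|X_0|^2\right]\right)^2}{2\sigma^4} + \phi \cdot \mathrm{SNR} \cdot \frac{E\left[|X_0|^2\right]}{\sigma^2} \right\} + o\left(\mathrm{SNR}^2\right) &\quad (24)
\end{aligned}
$$

where the second inequality follows by noting that, by the peak-constraint (9),
we have $0 \le |X_0| \le A$ so that $E[|X_0|^4] \le A^2 E[|X_0|^2]$. Also by (9) we have that

$$0 \le E\left[|X_0|^2\right] \le A^2 \tag{25}$$

so that if we introduce

$$\alpha \triangleq E\left[|X_0|^2\right] / A^2 \tag{26}$$

then $0 \le \alpha \le 1$. Consequently we have from (24) that

$$C \le \sup_{0 \le \alpha \le 1} \left\{ \frac{\alpha - \alpha^2}{2} + \phi \cdot \alpha \right\} \cdot \mathrm{SNR}^2 + o\left(\mathrm{SNR}^2\right). \tag{27}$$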

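When $\phi \le 1/2$, the supremum on the RHS of (27) is achieved inside the
interval $[0, 1]$ by $\alpha^* = \phi + \frac{1}{2}$ to yield

$$C \le \frac{(2\phi + 1)^2}{8}\, \mathrm{SNR}^2 + o\left(\mathrm{SNR}^2\right), \quad \phi \le \frac{1}{2} \tag{28}$$

and when $\phi > 1/2$ the supremum is achieved at $\alpha^* = 1$ to yield

$$C \le \phi \cdot \mathrm{SNR}^2 + o\left(\mathrm{SNR}^2\right), \quad \phi > \frac{1}{2}. \tag{29}$$

In both inequalities $o(\mathrm{SNR}^2)/\mathrm{SNR}^2 \to 0$ as $\mathrm{SNR} \to 0$.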
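The maximization over $\alpha$ can be double-checked numerically. The sketch below compares a brute-force grid search of the coefficient $(\alpha - \alpha^2)/2 + \phi\alpha$ of $\mathrm{SNR}^2$ with the closed-form maximizer $\min(1, \phi + 1/2)$; the function names and the grid resolution are arbitrary illustrative choices.

```python
def bound_coefficient(alpha, phi):
    """Coefficient of SNR^2 in the upper bound as a function of alpha."""
    return (alpha - alpha * alpha) / 2.0 + phi * alpha


def best_alpha_grid(phi, steps=10000):
    """Maximize the coefficient over a uniform grid on [0, 1]."""
    best = max(range(steps + 1), key=lambda i: bound_coefficient(i / steps, phi))
    return best / steps


def best_alpha_closed_form(phi):
    """Stationary point phi + 1/2 of the concave quadratic, clipped to [0, 1]."""
    return min(1.0, phi + 0.5)


quick = best_alpha_grid(0.25)   # a "quickly forgetting" example
slow = best_alpha_grid(1.0)     # a "slowly forgetting" example
```

Since the coefficient is a concave quadratic in $\alpha$, the grid maximizer lies within one grid step of the closed-form value.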




3.2 Proof of the Lower Bound 

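We shall next exhibit a lower bound on channel capacity that at low SNR
coincides with the upper bound (27). More specifically, for any $0 \le \alpha \le 1$ of
our choice and for any integer $b \ge 1$ we shall propose for every peak-amplitude
$A > 0$ a distribution on $\{X_k\}_{k=1}^{\infty}$ satisfying the peak-constraint (9) such that

$$\lim_{b \to \infty} \lim_{\mathrm{SNR} \downarrow 0} \frac{\lim_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n)}{\mathrm{SNR}^2} \ge \frac{\alpha - \alpha^2}{2} + \phi \cdot \alpha, \quad 0 \le \alpha \le 1. \tag{30}$$

The proposed distribution on $\{X_k\}$ can be described as follows:

$$X_k = U_{\lfloor (k-1)/b \rfloor} \cdot D_k, \quad k = 1, 2, \ldots \tag{31}$$

where $\lfloor \xi \rfloor$ denotes the largest integer not exceeding $\xi$, so that the magnitude
is constant over each block of $b$ consecutive inputs $X_{\nu b + 1}, \ldots, X_{\nu b + b}$;
$\{D_k\}_{k=1}^{\infty}$ are IID taking the values $\pm 1$ equi-probably; $\{U_\nu\}_{\nu=0}^{\infty}$ are IID
taking the value $A$ with probability $\alpha$ and the value $0$ with probability $(1 - \alpha)$;
and the sequences $\{D_k\}_{k=1}^{\infty}$ and $\{U_\nu\}_{\nu=0}^{\infty}$ are independent.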
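The proposed input law is easy to sample. The sketch below draws $X_1, \ldots, X_n$ with an on-off magnitude that is held constant over blocks of length $b$ and an independent equiprobable sign per symbol. It assumes that the blocks are aligned with the index sets $\{\nu b + 1, \ldots, \nu b + b\}$, i.e. that the block index of $X_k$ is $\lfloor (k-1)/b \rfloor$; the function name and the seeding are hypothetical.

```python
import random


def block_constant_inputs(n, b, alpha, A, seed=0):
    """Draw X_1..X_n with block-constant on-off magnitude and IID signs.

    Assumption: the block index of X_k is floor((k - 1) / b), so that
    X_1..X_b share one magnitude, X_{b+1}..X_{2b} the next, and so on.
    """
    rng = random.Random(seed)
    num_blocks = (n + b - 1) // b
    # U_nu: amplitude A with probability alpha, 0 otherwise, one draw per block.
    u = [A if rng.random() < alpha else 0.0 for _ in range(num_blocks)]
    # D_k: equiprobable signs, one draw per symbol.
    d = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    return [u[(k - 1) // b] * d[k - 1] for k in range(1, n + 1)]


x = block_constant_inputs(n=12, b=4, alpha=0.5, A=2.0, seed=1)
iid_case = block_constant_inputs(n=10, b=3, alpha=1.0, A=2.0)
```

With `alpha=1.0` every block is "on" and the sequence reduces to IID equiprobable $\pm A$ inputs.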

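We shall next show that for this distribution

$$\lim_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n) \ge \frac{1}{b} I(X_1, \ldots, X_b; Y_1, \ldots, Y_b). \tag{32}$$

This is a simple consequence of the fact that $\{X^{(\nu)}\}$ are IID where $X^{(\nu)}$ is the
$\nu$-th tuple

$$X^{(\nu)} = (X_{\nu b + 1}, \ldots, X_{\nu b + b}), \quad \nu = 0, 1, \ldots$$

and where we analogously define

$$Y^{(\nu)} = (Y_{\nu b + 1}, \ldots, Y_{\nu b + b}), \quad \nu = 0, 1, \ldots$$

Indeed, defining $m = \lfloor n/b \rfloor$ we have

$$
\begin{aligned}
\frac{1}{n} I(X_1^n; Y_1^n)
&\ge \frac{1}{n} I\left( \{X^{(\nu)}\}_{\nu=0}^{m-1}; Y_1^n \right) \\
&\ge \frac{1}{n} I\left( \{X^{(\nu)}\}_{\nu=0}^{m-1}; \{Y^{(\nu)}\}_{\nu=0}^{m-1} \right) \\
&\ge \frac{1}{n} \sum_{\nu=0}^{m-1} I\left( X^{(\nu)}; Y^{(\nu)} \right) \\
&= \frac{m}{n} I\left( X^{(0)}; Y^{(0)} \right) \\
&= \frac{\lfloor n/b \rfloor}{n} I(X_1, \ldots, X_b; Y_1, \ldots, Y_b),
\end{aligned}
$$

from which (32) follows upon letting $n$ tend to infinity.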



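Having proved (32) we now have that for our proposed input distribution

$$\lim_{\mathrm{SNR} \downarrow 0} \frac{\lim_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n)}{\mathrm{SNR}^2} \ge \frac{1}{b} \lim_{\mathrm{SNR} \downarrow 0} \frac{I(X_1, \ldots, X_b; Y_1, \ldots, Y_b)}{\mathrm{SNR}^2}. \tag{33}$$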



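We next use Corollary 1 of [?] to study the RHS of the above noting that for
the proposed input distribution (31)

$$
E\left[X_k X_j^*\right] =
\begin{cases}
\alpha \cdot A^2 & \text{if } k = j \\
0 & \text{otherwise}
\end{cases}
\tag{34}
$$

and

$$E\left[|X_k|^2 |X_j|^2\right] = \alpha \cdot A^4, \quad j, k \in \{1, \ldots, b\}. \tag{35}$$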

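Let $\mathbf{X}$ denote the vector $(X_1, X_2, \ldots, X_b)^{\mathsf{T}}$, and let $\mathsf{H}$ denote the diagonal
matrix with diagonal entries $H_1, H_2, \ldots, H_b$, then

$$
\begin{aligned}
I\left(X_1^b; Y_1^b\right)
&= \frac{1}{2\sigma^4} \operatorname{tr}\left\{ E\left[ \left( E\left[ \mathsf{H} \mathbf{X} \mathbf{X}^\dagger \mathsf{H}^\dagger \,\middle|\, \mathbf{X} \right] \right)^2 \right] - \left( E\left[ \mathsf{H}\, E\left[ \mathbf{X} \mathbf{X}^\dagger \right] \mathsf{H}^\dagger \right] \right)^2 \right\} + o\left(\mathrm{SNR}^2\right) \\
&= \frac{1}{2\sigma^4} \operatorname{tr}\left\{ E\left[ \begin{pmatrix} R(0) |X_1|^2 & \cdots & R(1-b) X_1 X_b^* \\ \vdots & \ddots & \vdots \\ R(b-1) X_b X_1^* & \cdots & R(0) |X_b|^2 \end{pmatrix}^2 \right] - \begin{pmatrix} R(0) E\left[|X_1|^2\right] & & \\ & \ddots & \\ & & R(0) E\left[|X_b|^2\right] \end{pmatrix}^2 \right\} + o\left(\mathrm{SNR}^2\right) \\
&= \frac{1}{2\sigma^4} \left( \sum_{i=1}^{b} \sum_{j=1}^{b} |R(i-j)|^2 E\left[ |X_i|^2 |X_j|^2 \right] - \sum_{j=1}^{b} |R(0)|^2 \left( E\left[ |X_j|^2 \right] \right)^2 \right) + o\left(\mathrm{SNR}^2\right) \\
&= \frac{1}{2\sigma^4} \left( \alpha A^4 \sum_{i=1}^{b} \sum_{j=1}^{b} |R(i-j)|^2 - b\, \alpha^2 A^4 \right) + o\left(\mathrm{SNR}^2\right) \\
&= \frac{1}{2} \mathrm{SNR}^2 \left( b \cdot (\alpha - \alpha^2) + \alpha \sum_{i=1}^{b} \sum_{\substack{1 \le j \le b \\ j \ne i}} |R(i-j)|^2 \right) + o\left(\mathrm{SNR}^2\right) \\
&= \frac{1}{2} \mathrm{SNR}^2 \left( b \cdot (\alpha - \alpha^2) + \alpha S(b) \right) + o\left(\mathrm{SNR}^2\right)
\end{aligned}
\tag{36}
$$

where in the last equality we defined

$$S(b) \triangleq \sum_{i=1}^{b} \sum_{\substack{1 \le j \le b \\ j \ne i}} |R(i-j)|^2. \tag{37}$$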



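Dividing (36) by $b$ we obtain

$$\frac{1}{b} I\left(X_1^b; Y_1^b\right) = \frac{1}{2} \left( \alpha - \alpha^2 + \alpha \frac{S(b)}{b} \right) \mathrm{SNR}^2 + o\left(\mathrm{SNR}^2\right). \tag{38}$$

Using (38) and (33) the proof of (30) is completed by noting that

$$\lim_{b \to \infty} \frac{S(b)}{b} = 2\phi \tag{39}$$

which follows from a direct calculation showing that

$$
\begin{aligned}
S(b+1) - S(b) &= 2 \sum_{\nu=1}^{b} |R(\nu)|^2 &\quad (40) \\
&\to 2\phi \quad (\text{as } b \to \infty) &\quad (41)
\end{aligned}
$$

and a Cesàro argument.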
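The Cesàro step can be illustrated numerically. The sketch below assumes that $S(b)$ is the double sum of $|R(i-j)|^2$ over $1 \le i, j \le b$ with $i \ne j$, and uses a hypothetical first-order Gauss-Markov autocorrelation $R(m) = a^{|m|}$, for which $\phi = a^2/(1 - a^2)$; neither the example fading law nor the function name comes from the paper.

```python
def S(b, R):
    """Sum of |R(i - j)|^2 over 1 <= i, j <= b with i != j."""
    return sum(abs(R(i - j)) ** 2
               for i in range(1, b + 1)
               for j in range(1, b + 1)
               if i != j)


a = 0.5


def R(m):
    """Assumed autocorrelation of a first-order Gauss-Markov fading process."""
    return a ** abs(m)


phi = a * a / (1.0 - a * a)
# S(b)/b for increasing block lengths; it should creep up towards 2 * phi.
ratios = {b: S(b, R) / b for b in (1, 10, 100, 400)}
# Increment S(6) - S(5), to be compared with twice the partial sum of |R(nu)|^2.
increment = S(6, R) - S(5, R)
partial_sum = 2.0 * sum(abs(R(nu)) ** 2 for nu in range(1, 6))
```

For this example the ratio increases with $b$ and approaches $2\phi$ from below at rate $O(1/b)$, which suggests that moderately long blocks already capture most of the gain.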

Remark 1. When $\phi > 1/2$, the optimal choice for $\alpha$ is 1. This corresponds to
the sequence $\{U_\nu\}$ in (31) being deterministically equal to $A$, thus resulting in
our proposed input sequence $\{X_k\}$ being IID. It is thus seen that for $\phi > 1/2$
the asymptotic capacity at low SNR is achieved by IID inputs. For $\phi < 1/2$
our proposed input distribution is not IID. Indeed, for $0 < \phi < 1/2$ IID inputs
typically do not achieve the low SNR capacity [12].